1. (Amended) A predictive action determination apparatus comprising:

a state observation section for observing a state with respect to a predetermined
environment and obtaining state data;

a state value storage section for storing a state value for each of states of the
environment;

an environment prediction section for predicting a future state change in the
environment, based on the state data obtained by the state observation section;

a target state determination section for determining, as a target state, a future state
suitable for action determination among future states predicted by the environment
prediction section, based on the state value for each of future states stored in the state value
storage section; and

a first action determination section for determining an action of the apparatus,
based on the target state determined by the target state determination section,

wherein the environment prediction section predicts a future state change in the
environment, which is not influenced by actions of the apparatus.

2. (Cancelled)

3. The apparatus of claim 1, wherein the target state determination section
determines, as a target state, a future state of which a state value is maximal.



4. The apparatus of claim 1, further comprising a state value update section for 
updating, by learning, the state value stored in the state value storage section, 

wherein the target state determination section determines, as the target state, one of 
the future states of which the state value has been already updated by the state value update
section. 
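
Illustrative note (not part of the claims): the restriction in claim 4 can be sketched as follows. The sketch is only one possible reading. The class name `StateValueStore`, the function `pick_target`, the TD(0) update rule, and the parameters `alpha` and `gamma` are assumptions introduced for illustration; the claim itself says only that the state value is updated "by learning".

```python
from typing import Dict, Hashable, Iterable, Optional, Set


class StateValueStore:
    """Hypothetical state value storage that remembers which entries were learned."""

    def __init__(self, default: float = 0.0) -> None:
        self.values: Dict[Hashable, float] = {}
        self.updated: Set[Hashable] = set()  # states whose value has been updated
        self.default = default

    def get(self, state: Hashable) -> float:
        return self.values.get(state, self.default)

    def td_update(self, state: Hashable, reward: float, next_state: Hashable,
                  alpha: float = 0.1, gamma: float = 0.9) -> None:
        # Assumed learning rule (TD(0)): V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))
        v = self.get(state)
        self.values[state] = v + alpha * (reward + gamma * self.get(next_state) - v)
        self.updated.add(state)


def pick_target(store: StateValueStore,
                predicted: Iterable[Hashable]) -> Optional[Hashable]:
    """Pick the highest-valued predicted state among those already updated."""
    candidates = [s for s in predicted if s in store.updated]
    if not candidates:
        return None  # no predicted state has a learned value yet
    return max(candidates, key=store.get)
```

Returning `None` when no predicted state has a learned value corresponds to the case in which a target state could not be determined.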

5. (Amended) A predictive action determination apparatus comprising: 

a state observation section for observing a state with respect to a predetermined

environment and obtaining state data; 

a state value storage section for storing a state value for each of states of the 
environment; 

an environment prediction section for predicting a future state change in the 
environment, based on the state data obtained by the state observation section;

a target state determination section for determining, as a target state, a future state 
suitable for action determination among future states predicted by the environment 
prediction section, based on the state value for each of future states stored in the state value 
storage section; and 

a first action determination section for determining an action of the apparatus,

based on the target state determined by the target state determination section, 

wherein the target state determination section discounts the state value obtained 
from the state value storage section according to the number of steps from a current step 
and uses the discounted state value. 
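
Illustrative note (not part of the claims): the discounting in claim 5 might look like the sketch below. The geometric form `gamma ** steps`, the default value of `gamma`, and the function name `select_target` are assumptions; the claim requires only that the state value be discounted according to the number of steps from the current step.

```python
from typing import Dict, Hashable, List, Optional, Tuple


def select_target(predicted: List[Hashable],
                  state_values: Dict[Hashable, float],
                  gamma: float = 0.9) -> Optional[Tuple[int, Hashable]]:
    """Return (steps_ahead, state) with the highest discounted value, or None.

    predicted[0] is taken to be the state one step after the current step.
    """
    best: Optional[Tuple[int, Hashable]] = None
    best_value = float("-inf")
    for steps, state in enumerate(predicted, start=1):
        if state not in state_values:
            continue  # no stored value for this predicted state
        # Assumed discount: geometric in the number of steps from the current step.
        discounted = (gamma ** steps) * state_values[state]
        if discounted > best_value:
            best_value = discounted
            best = (steps, state)
    return best
```

With a discount factor below 1, a nearer state with a slightly lower stored value can be preferred over a distant state with a higher one.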


6. The apparatus of claim 1, 

wherein the state value storage section stores a state value for a state including the 
apparatus, and 

wherein the apparatus further includes: 
a value conversion section for obtaining, based on the state value stored in the state
value storage section, a state value for a future state which is predicted by the environment
prediction section and does not include the apparatus and giving the obtained state value to 
the target state determination section. 

7. The apparatus of claim 1, further comprising:

a second action determination section for determining an action of the apparatus, 
based on a predetermined action rule; and 

an action selection section for receiving actions determined by the first and second 
action determination sections, respectively, as first and second action candidates and
selecting one of the first and second action candidates as an actual action.

8. The apparatus of claim 7, 

wherein the target state determination section gives a selection signal indicating 
whether or not a target state could be determined to the action selection section, and 
wherein if the selection signal indicates that a target state is determined, the action
selection section selects the first action candidate while if the selection signal indicates that
a target state could not be determined, the action selection section selects the second action 
candidate. 

9. The apparatus of claim 1,

wherein the first action determination section includes: 

a state-change-with-action detection section for receiving the state data and 
detecting, from a current state indicated by the state data, a state and an action in a 
previous step; 

a state-change-with-action storage section for storing, as a state change, a
combination of the current state and the state and the action in the previous step detected
by the state-change-with-action detection section; and 

an action planning section for searching the state-change-with-action storage 
section for a history of a state change in a period between the current state and the target 
state and determining an action, based on a result of the search.

10. The apparatus of claim 9, wherein the action planning section performs a 
backward search in the direction from the target state to the current state when the state 
change storage section performs the search. 
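
Illustrative note (not part of the claims): the backward search of claims 9 and 10 can be sketched as below. The sketch assumes that stored state changes are `(previous_state, action, current_state)` tuples and that the search is breadth-first; the function name `plan_actions` and the tuple layout are hypothetical.

```python
from collections import deque
from typing import Dict, Hashable, Iterable, List, Optional, Tuple

# Assumed layout of one stored state change: (previous_state, action, current_state)
Transition = Tuple[Hashable, Hashable, Hashable]


def plan_actions(transitions: Iterable[Transition],
                 current: Hashable,
                 target: Hashable) -> Optional[List[Hashable]]:
    """Search backward from target to current; return the actions in forward order."""
    if current == target:
        return []
    # Index the stored state changes by resulting state so we can walk backward.
    incoming: Dict[Hashable, List[Tuple[Hashable, Hashable]]] = {}
    for prev, action, nxt in transitions:
        incoming.setdefault(nxt, []).append((prev, action))

    # next_step[s] = (action, successor) on a shortest stored path from s to target.
    next_step: Dict[Hashable, Tuple[Hashable, Hashable]] = {}
    visited = {target}
    queue = deque([target])
    while queue:
        state = queue.popleft()
        for prev, action in incoming.get(state, []):
            if prev in visited:
                continue
            visited.add(prev)
            next_step[prev] = (action, state)
            if prev == current:
                queue.clear()  # current state reached, stop searching
                break
            queue.append(prev)

    if current not in next_step:
        return None  # no stored history connects current state to target
    actions: List[Hashable] = []
    state = current
    while state != target:
        action, state = next_step[state]
        actions.append(action)
    return actions
```

The first element of the returned list is the action to take in the current step. Searching backward from the target expands only states known to lead to it, which is one plausible motivation for the direction stated in claim 10.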


11. (Amended) A predictive action determination apparatus comprising:

a state observation section for observing a state with respect to a predetermined 
environment and obtaining state data; 

a state value storage section for storing a state value for each of states of the 
environment;

an environment prediction section for predicting a future state change in the 
environment, based on the state data obtained by the state observation section; 

a target state determination section for determining, as a target state, a future state 
suitable for action determination among future states predicted by the environment 
prediction section, based on the state value for each of future states stored in the state value
storage section; and 

a first action determination section for determining an action of the apparatus, 
based on the target state determined by the target state determination section, 
wherein the environment prediction section includes: 
a state change detection section for receiving the state data and detecting a state in a
previous step from a current state indicated by the state data;

a state change storage section for storing, as a state change, a combination of the 
current state and the state in the previous step detected by the state change detection 
section; and 

a state prediction section for predicting a state after the current state from the state

change storage section. 
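
Illustrative note (not part of the claims): one way to read the environment prediction section of claim 11 is as a table of observed state changes from which the most frequently seen successor is predicted. The class name `StateChangeStore`, the frequency-count rule, and the method names are assumptions made for this sketch; the claim does not specify how the prediction is derived from the stored state changes.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Hashable, List, Optional


class StateChangeStore:
    """Hypothetical state change storage with a simple frequency-based predictor."""

    def __init__(self) -> None:
        # previous_state -> counts of the states observed to follow it
        self._successors: DefaultDict[Hashable, Counter] = defaultdict(Counter)

    def record(self, previous_state: Hashable, current_state: Hashable) -> None:
        """Store one (previous state, current state) combination."""
        self._successors[previous_state][current_state] += 1

    def predict_next(self, state: Hashable) -> Optional[Hashable]:
        """Predict the state after `state` as its most frequently observed successor."""
        counts = self._successors.get(state)
        if not counts:
            return None  # this state has never been seen as a previous state
        return counts.most_common(1)[0][0]

    def predict_sequence(self, state: Hashable, steps: int) -> List[Hashable]:
        """Chain one-step predictions for up to `steps` steps."""
        predicted: List[Hashable] = []
        for _ in range(steps):
            state = self.predict_next(state)
            if state is None:
                break
            predicted.append(state)
        return predicted
```

The list returned by `predict_sequence` plays the role of the future states from which a target state is then chosen.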

12. (Amended) A method of determining in a predictive action determination 
apparatus an action of the apparatus, comprising: 

a first step of observing a state with respect to a predetermined environment and

obtaining state data; 

a second step of predicting a future state change in the environment, based on the 
obtained state data; 

a third step of determining, as a target state, a future state suitable for action
determination among predicted future states, with reference to the state value for each of
the future states; and

a fourth step of determining the action of the apparatus, based on the determined 
target state, 

wherein a predicted state change is a future state change in the environment, which
is not influenced by actions of the apparatus.

13. (Cancelled) 

14. The method of claim 12, 
wherein in the third step,
one of the future states of which a state value is maximal is determined as the target
state.

15. The method of claim 12,

wherein the predictive action determination apparatus updates a state value for each 
of states of the environment by learning, and 

wherein in the third step, one of the future states of which a state value has been 
already updated is determined as a target state. 



